conversational AI system
DICES Dataset: Diversity in Conversational AI Evaluation for Safety
Machine learning approaches often require training and evaluation datasets with a clear separation between positive and negative examples. This requirement overly simplifies the natural subjectivity present in many tasks, and obscures the inherent diversity in human perceptions and opinions about many content items. Preserving the variance in content and diversity in human perceptions in datasets is often expensive and laborious. This is especially troubling when building safety datasets for conversational AI systems, as safety is socio-culturally situated in this context. To demonstrate this crucial aspect of conversational AI safety, and to facilitate in-depth model performance analyses, we introduce the DICES (Diversity In Conversational AI Evaluation for Safety) dataset, which contains fine-grained demographic information about raters, provides high replication of ratings per item to ensure statistical power for analyses, and encodes rater votes as distributions across different demographics to allow in-depth exploration of different aggregation strategies. The DICES dataset enables the observation and measurement of variance, ambiguity, and diversity in the context of safety for conversational AI. We further describe a set of metrics that show how rater diversity influences safety perception across different geographic regions, ethnicity groups, age groups, and genders. The goal of the DICES dataset is to serve as a shared resource and benchmark that respects diverse perspectives during safety evaluation of conversational AI systems.
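As a rough illustration of the analyses such a dataset enables, the sketch below keeps rater votes as per-demographic distributions rather than collapsing them into a single majority label. The records and field names are hypothetical, not the dataset's actual schema.

```python
# A minimal sketch of distribution-preserving aggregation: group safety votes
# by demographic slice instead of pooling them into one majority label.
# The example records and field names below are hypothetical.
from collections import Counter, defaultdict

ratings = [  # (conversation_id, rater_age_group, safety_label)
    ("conv-1", "18-30", "unsafe"),
    ("conv-1", "18-30", "safe"),
    ("conv-1", "31-50", "unsafe"),
    ("conv-1", "51+",   "safe"),
]

# Keep one vote counter per (item, demographic slice) pair.
by_group = defaultdict(Counter)
for conv_id, age_group, label in ratings:
    by_group[(conv_id, age_group)][label] += 1

# Report each slice's rating distribution, not a single collapsed label.
for (conv_id, age_group), counts in sorted(by_group.items()):
    total = sum(counts.values())
    dist = {label: n / total for label, n in counts.items()}
    print(conv_id, age_group, dist)
```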
Towards a Multimodal Document-grounded Conversational AI System for Education
Taneja, Karan, Singh, Anjali, Goel, Ashok K.
Multimedia learning using text and images has been shown to improve learning outcomes compared to text-only instruction. However, conversational AI systems in education predominantly rely on text-based interactions, while multimodal conversations for multimedia learning remain unexplored. Moreover, deploying conversational AI in learning contexts requires grounding in reliable sources and verifiability to create trust. We present MuDoC, a Multimodal Document-grounded Conversational AI system based on GPT-4o, that leverages both text and visuals from documents to generate responses interleaved with text and images. Its interface allows verification of AI-generated content through seamless navigation to the source. We compare MuDoC to a text-only system to explore differences in learner engagement, trust in the AI system, and performance on problem-solving tasks. Our findings indicate that both visuals and verifiability of content enhance learner engagement and foster trust; however, no significant impact on performance was observed. We draw upon theories from cognitive and learning sciences to interpret the findings and derive implications, and outline future directions for the development of multimodal conversational AI systems in education.
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
Temporal Context Awareness: A Defense Framework Against Multi-turn Manipulation Attacks on Large Language Models
Kulkarni, Prashant, Namer, Assaf
Many Large Language Models (LLMs) today are vulnerable to multi-turn manipulation attacks, where adversaries gradually build context through seemingly benign conversational turns to elicit harmful or unauthorized responses. These attacks exploit the temporal nature of dialogue to evade single-turn detection methods, posing a significant risk to the safe deployment of LLMs. This paper introduces the Temporal Context Awareness (TCA) framework, a novel defense mechanism designed to address this challenge by continuously analyzing semantic drift, cross-turn intention consistency, and evolving conversational patterns. The TCA framework integrates dynamic context embedding analysis, cross-turn consistency verification, and progressive risk scoring to detect and mitigate manipulation attempts effectively. Preliminary evaluations on simulated adversarial scenarios demonstrate the framework's potential to identify subtle manipulation patterns often missed by traditional detection techniques, offering a much-needed layer of security for conversational AI systems. In addition to outlining the design of TCA, we analyze diverse attack vectors and their progression across multi-turn conversations, providing valuable insights into adversarial tactics and their impact on LLM vulnerabilities. Our findings underscore the pressing need for robust, context-aware defenses in conversational AI systems and highlight the TCA framework as a promising direction for securing LLMs while preserving their utility in legitimate applications.
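The abstract names three ingredients: semantic drift analysis, cross-turn consistency, and progressive risk scoring. A minimal sketch of how such a score might accumulate across turns follows; the embedding function, weights, and decay factor are illustrative stand-ins, not the paper's actual method.

```python
# A sketch of progressive risk scoring driven by semantic drift: embed each
# turn, measure drift from the conversation's opening intent, and accumulate
# a smoothed risk score. The embedding and parameters are stand-ins.
import math

def embed(text: str) -> list[float]:
    # Stand-in embedding (letter frequencies); a real system would call a
    # sentence encoder here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def risk_scores(turns: list[str], drift_weight=0.6, decay=0.8) -> list[float]:
    """Accumulate drift from the first turn into a progressive risk score."""
    anchor = embed(turns[0])
    score, scores = 0.0, []
    for turn in turns[1:]:
        drift = 1.0 - cosine(anchor, embed(turn))  # semantic drift per turn
        score = decay * score + drift_weight * drift
        scores.append(score)
    return scores

turns = ["Tell me about chemistry homework help",
         "What household chemicals react strongly?",
         "How would someone combine them for maximum effect?"]
# Scores accumulate as the conversation drifts from its starting intent.
print(risk_scores(turns))
```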
From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations
Baskar, Sarvesh, Verelakar, Tanmay Tulsidas, Parthasarathy, Srinivasan, Gaur, Manas
In multi-turn dialogues, large language models (LLMs) face a critical challenge: ensuring coherence while adapting to user-specific information. This study introduces the persona knowledge gap, the discrepancy between a model's internal understanding and the knowledge required for coherent, personalized conversations. While prior research has recognized these gaps, computational methods for their identification and resolution remain underexplored. We propose Conversation Preference Elicitation and Recommendation (CPER), a novel framework that dynamically detects and resolves persona knowledge gaps using intrinsic uncertainty quantification and feedback-driven refinement. CPER consists of three key modules: a Contextual Understanding Module for preference extraction, a Dynamic Feedback Module for measuring uncertainty and refining persona alignment, and a Persona-Driven Response Generation Module for adapting responses based on accumulated user context. We evaluate CPER on two real-world datasets: CCPE-M for preferential movie recommendations and ESConv for mental health support. In A/B testing, human evaluators preferred CPER's responses 42% more often than baseline models on CCPE-M and 27% more often on ESConv. A qualitative human evaluation confirms that CPER's responses are preferred for maintaining contextual relevance and coherence, particularly in longer (12+ turn) conversations.
- North America > United States > Maryland > Baltimore County (0.14)
- North America > United States > Maryland > Baltimore (0.14)
- North America > United States > Ohio (0.04)
- (2 more...)
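A rough sketch of the guess-versus-ask decision at the heart of CPER follows, assuming entropy as the intrinsic uncertainty measure; the paper's actual quantification and thresholds may differ.

```python
# A hypothetical sketch of switching from guessing to asking: estimate
# uncertainty over a persona slot and ask a clarifying question when a
# persona knowledge gap is detected. Measure and threshold are stand-ins.
import math

def preference_entropy(slot_probs: dict[str, float]) -> float:
    """Shannon entropy over candidate values for a persona slot."""
    return -sum(p * math.log2(p) for p in slot_probs.values() if p > 0)

def next_action(slot: str, slot_probs: dict[str, float], threshold=1.0) -> str:
    """Ask when uncertain; recommend when the slot is well resolved."""
    if preference_entropy(slot_probs) > threshold:
        return f"ASK: Which {slot} do you usually prefer?"
    best = max(slot_probs, key=slot_probs.get)
    return f"RECOMMEND: something in the {best} {slot}"

# High-entropy belief over the user's preferred genre -> ask, don't guess.
print(next_action("genre", {"comedy": 0.3, "thriller": 0.35, "drama": 0.35}))
# Low-entropy belief -> proceed with a personalized recommendation.
print(next_action("genre", {"comedy": 0.9, "thriller": 0.05, "drama": 0.05}))
```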
Experts-in-the-Loop: Establishing an Effective Workflow in Crafting Privacy Q&A
Kolagar, Zahra, Leschanowsky, Anna Katharina, Popp, Birgit
Privacy policies play a vital role in safeguarding user privacy as legal jurisdictions worldwide emphasize the need for transparent data processing. While the suitability of privacy policies for enhancing transparency has been critically discussed, employing conversational AI systems presents unique challenges in informing users effectively. In this position paper, we propose a dynamic workflow for transforming privacy policies into privacy question-and-answer (Q&A) pairs to make them easily accessible through conversational AI. In doing so, we facilitate interdisciplinary collaboration between legal experts and conversation designers, while also considering how to use the generative capabilities of large language models and addressing the associated challenges. Our proposed workflow underscores continuous improvement and monitoring throughout the construction of privacy Q&As, advocating for comprehensive review and refinement through an experts-in-the-loop approach.
- Europe > Germany (0.05)
- North America > United States > California (0.04)
- Asia > China > Hong Kong (0.04)
- (5 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)
Expanding the Set of Pragmatic Considerations in Conversational AI
Seals, S. M., Shalin, Valerie L.
Despite considerable performance improvements, current conversational AI systems often fail to meet user expectations. We discuss several pragmatic limitations of these systems and illustrate them with examples that are syntactically appropriate but have clear pragmatic deficiencies. We label such complaints "Turing Test Triggers" (TTTs), as they indicate where current conversational AI systems fall short of human behavior. We develop a taxonomy of pragmatic considerations intended to identify the pragmatic competencies a conversational AI system requires, and discuss implications for the design and evaluation of conversational AI systems.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > New York (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- Instructional Material (0.46)
- Research Report (0.40)
- Health & Medicine (1.00)
- Government > Regional Government (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- (2 more...)
BiasAsker: Measuring the Bias in Conversational AI System
Wan, Yuxuan, Wang, Wenxuan, He, Pinjia, Gu, Jiazhen, Bai, Haonan, Lyu, Michael
Powered by advanced Artificial Intelligence (AI) techniques, conversational AI systems, such as ChatGPT and digital assistants like Siri, have been widely deployed in daily life. However, such systems may still produce content containing biases and stereotypes, causing potential social problems. Due to the data-driven, black-box nature of modern AI techniques, comprehensively identifying and measuring biases in conversational systems remains a challenging task. In particular, it is hard to generate inputs that comprehensively trigger potential bias, due to the lack of data containing both social groups and biased properties. In addition, modern conversational systems can produce diverse responses (e.g., chatting and explanation), which makes existing bias detection methods based solely on sentiment and toxicity difficult to apply. In this paper, we propose BiasAsker, an automated framework to identify and measure social bias in conversational AI systems. To obtain social groups and biased properties, we construct a comprehensive social bias dataset containing a total of 841 groups and 8,110 biased properties. Given the dataset, BiasAsker automatically generates questions and adopts a novel method based on existence measurement to identify two types of biases (i.e., absolute bias and related bias) in conversational systems. Extensive experiments on 8 commercial systems and 2 well-known research models, such as ChatGPT and GPT-3, show that 32.83% of the questions generated by BiasAsker can trigger biased behaviors in these widely deployed conversational systems. All the code, data, and experimental results have been released to facilitate future research.
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Asia > China > Hong Kong (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- (4 more...)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (0.46)
Discourse over Discourse: The Need for an Expanded Pragmatic Focus in Conversational AI
Seals, S. M., Shalin, Valerie L.
The summarization of conversation, a case of discourse over a discourse, clearly illustrates a series of pragmatic limitations in contemporary conversational AI applications. While there has been some previous work examining pragmatic issues in conversational AI (i.e., Bao et al., 2022; Kim et al., 2020, 2021a; Nath, 2020; Wu and Ong, 2021), additional progress depends on understanding the source of limitations in current applications. We aim to contribute to both theory and applications by examining these limitations in conversational summarization and conversational AI more broadly. We illustrate the remaining challenges in this area with ill-conceived examples inspired by conversational AI systems (Gratch et al., 2014), conversation summarization models (Gaur et al., 2021), and author interactions with chatbots and voice assistants. Like Chomsky's star sentences, these examples have clear pragmatic deficiencies that trigger the Turing Test criterion. No competent speaker would construct such discourse.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (16 more...)
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.46)
- Health & Medicine (0.93)
- Government (0.68)
Safer Conversational AI as a Source of User Delight
Lu, Xiaoding, Korshuk, Aleksey, Liu, Zongyi, Beauchamp, William, Chai Research
This work explores the impact of moderation on users' enjoyment of conversational AI systems. While recent advancements in Large Language Models (LLMs) have led to highly capable conversational AIs that are increasingly deployed in real-world settings, there is growing concern over AI safety and the need to moderate systems to encourage safe language and prevent harm. However, some users argue that current approaches to moderation limit the technology, compromise free expression, and reduce the value it delivers. This study takes an unbiased stance and shows that moderation does not necessarily detract from user enjoyment. Heavy-handed moderation does appear to have a detrimental effect, but models that are moderated to be safer can lead to a better user experience. By deploying various conversational AIs on the Chai platform, the study finds that user retention can increase with a level of moderation and safe system design. These results demonstrate the importance of appropriately defining safety in models in a way that is both responsible and focused on serving users.
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- (8 more...)
- Research Report (0.70)
- Overview > Growing Problem (0.34)